Abstract

We prove a Chernoff-type upper variance bound for the multinomial and the negative 
multinomial distribution. 
1 Introduction

Let $Z$ be a standard normal random variable and let $g$ be an absolutely continuous function
with a.s. derivative $g'$. Chernoff [14] proved that
$\mathrm{Var}\,g(Z) \le \mathbb{E}[(g'(Z))^2]$, provided that $\mathbb{E}[(g'(Z))^2]$
is finite, where equality holds if and only if $g$ is a polynomial of degree at most one;
see also the earlier papers by Nash [20] and Brascamp and Lieb [7]. This inequality has been
generalized and extended by many authors (see, e.g., [13, 8, 9, 10, 19, 18, 22, 6, 16, 15, 17,
4, 5, 12, 23, 2, 3, 1, 24]).

Let $X$ be an integer-valued random variable (r.v.) with finite mean $\mu$, finite variance
$\sigma^2$ and probability mass function (p.m.f.) $p$, and let the function $w$ be defined by

\[ \sum_{j \le x} (\mu - j)\,p(j) = \sigma^2 w(x)\,p(x) \quad \text{for all } x \in \mathbb{Z}. \]

When $w$ is a quadratic polynomial (that is, of degree at most 2), Cacoullos and
Papathanasiou [9] proved that, for any suitable function $g$ defined on the support of $X$
(see also Afendras et al. [4]),

\[ \mathrm{Var}\,g(X) \le \sigma^2\, \mathbb{E}\big[w(X)\,(\Delta g(X))^2\big], \tag{1.1} \]

where $\Delta$ is the forward difference operator. Furthermore, the following Stein-type covariance
identity holds (see Cacoullos and Papathanasiou [9], Afendras et al. [5]):

\[ \mathrm{Cov}[X, g(X)] = \sigma^2\, \mathbb{E}\big[w(X)\,\Delta g(X)\big]. \tag{1.2} \]
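As a small numerical illustration of (1.1) and (1.2), the sketch below computes $w$ directly from its defining relation for a binomial distribution and checks both statements in exact rational arithmetic. The parameter values, the test function `g`, and the helper names `binomial_pmf` and `w_from_definition` are assumptions chosen only for this example; they do not come from the cited papers.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, pi):
    """Exact p.m.f. of b(n, pi), returned as a list indexed by x = 0, ..., n."""
    return [comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)]

def w_from_definition(p):
    """Solve sum_{j<=x} (mu - j) p(j) = sigma^2 w(x) p(x) for w on the support."""
    mu = sum(x * px for x, px in enumerate(p))
    var = sum((x - mu) ** 2 * px for x, px in enumerate(p))
    w, partial = [], Fraction(0)
    for x, px in enumerate(p):
        partial += (mu - x) * px
        w.append(partial / (var * px))
    return mu, var, w

n, pi = 6, Fraction(1, 3)            # illustrative parameters (assumption)
p = binomial_pmf(n, pi)
mu, var, w = w_from_definition(p)

def g(x):                            # arbitrary nonlinear test function (assumption)
    return x**3 - 2 * x

def dg(x):                           # forward difference: g(x + 1) - g(x)
    return g(x + 1) - g(x)

mean_g = sum(g(x) * px for x, px in enumerate(p))
var_g = sum((g(x) - mean_g) ** 2 * px for x, px in enumerate(p))
cov_xg = sum((x - mu) * g(x) * px for x, px in enumerate(p))

# Right-hand sides of (1.2) and (1.1), respectively.
rhs_cov = var * sum(w[x] * dg(x) * px for x, px in enumerate(p))
rhs_var = var * sum(w[x] * dg(x) ** 2 * px for x, px in enumerate(p))

print("identity (1.2) holds:", cov_xg == rhs_cov)
print("bound (1.1) holds:   ", var_g <= rhs_var)
```

Because the arithmetic uses `Fraction`, the covariance identity is checked as an exact equality rather than up to rounding error. For the binomial case the computed `w` is the linear function $(n-x)/[n(1-\pi)]$, which is of degree at most 2 as (1.1) requires.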

Cacoullos and Papathanasiou [11] extended this identity to discrete multivariate distributions
(see Appendix A) and established a lower bound for the variance of $g(X)$,
where $g$ is a suitable real function defined on the support of $X$. For the multinomial and
the negative multinomial distribution this bound takes the form

\[ \mathrm{Var}\,g(X) \ge \mathbb{E}\big(w(X)\nabla^t g(X)\big)\, \Sigma\, \mathbb{E}\big(w(X)\nabla g(X)\big), \]

where $\Sigma$ is the dispersion matrix of $X$, the function $w$ is given by (2.3) for the multinomial
case and by (2.5) for the negative multinomial case, and $\nabla g$ is the grad of $g$ (see Definition
2.1).

Chen [13] extended Chernoff's bound to the multivariate normal case. Specifically, let
$X$ follow the $k$-dimensional normal distribution $N_k(\mu, \Sigma)$. Then,

\[ \mathrm{Var}\,g(X) \le \mathbb{E}\big(\nabla^t g(X)\, \Sigma\, \nabla g(X)\big), \]

where $\nabla g(x) = (\partial g(x)/\partial x_1, \ldots, \partial g(x)/\partial x_k)^t$ is the grad of $g$ (cf. Definition 2.1(a,b)).

In this note we extend (1.1) to the multinomial and negative multinomial distributions.
Specifically, we prove that

\[ \mathrm{Var}\,g(X) \le \mathbb{E}\big(w(X)\nabla^t g(X)\, \Sigma\, \nabla g(X)\big). \]

2 Preliminaries 

The following definitions will be used in the sequel. 

Definition 2.1 Consider the vectors $x = (x_1, \ldots, x_k)^t \in \mathbb{R}^k$ and $\pi = (\pi_1, \ldots, \pi_k)^t \in (0, 1)^k$,
a non-negative integer $\nu$ and a real function $g$ defined on $\mathbb{R}^k$. We define:

(a) $g_i(x) = \Delta_i g(x) := g(x + e_i) - g(x)$, where $e_i$ is the $i$-th vector of the standard
orthonormal basis of $\mathbb{R}^k$;

(b) $\nabla^t g(x) = (\nabla g(x))^t := (g_1(x), g_2(x), \ldots, g_k(x))$;

(c) $\pi^x := \pi_1^{x_1} \cdots \pi_k^{x_k}$;

(d) $\binom{\nu}{x} := \nu!/[x_1! \cdots x_k!\,(\nu - x_1 - \cdots - x_k)!]$, provided that $x \in \mathbb{N}^k$ with $\sum_{i=1}^k x_i \le \nu$;

(e) $x_{-k} := (x_1, \ldots, x_{k-1})^t \in \mathbb{R}^{k-1}$.

Definition 2.2 Let $X = (X_1, \ldots, X_k)^t$ be a discrete random vector. We denote by:

(a) $b(n, \pi)$ the binomial distribution with p.m.f. $p(x) = \binom{n}{x}\pi^x(1 - \pi)^{n-x}$, $x = 0, 1, \ldots, n$;

(b) $nb(r, \pi)$ the negative binomial distribution with p.m.f. $p(x) = \binom{r+x-1}{x}\pi^r(1 - \pi)^x$,
$x = 0, 1, \ldots$ ($r \in \mathbb{N} \setminus \{0\}$);

(c) $m_k(n, \pi)$ the $k$-dimensional multinomial distribution with p.m.f. $p(x) = \binom{n}{x}\pi^x \pi_0^{x_0}$,
$x \in \mathbb{N}^k$ with $\sum_{i=1}^k x_i \le n$, where $x_0 := n - \sum_{i=1}^k x_i$, $\pi \in (0, 1)^k$ and $\pi_0 := 1 - \sum_{i=1}^k \pi_i > 0$;

(d) $nm_k(r, \theta)$ the $k$-dimensional negative multinomial distribution with p.m.f.
$p(x) = \binom{r + x_1 + \cdots + x_k - 1}{x}\theta_0^r\,\theta^x$, $x \in \mathbb{N}^k$, where $\theta \in (0, 1)^k$ and $\theta_0 := 1 - \sum_{i=1}^k \theta_i > 0$;

(e) $p_k(x) = p_{X_k}(x_k)$ and $p_{-k}(x) = p_{X_{-k}}(x_{-k})$ the p.m.f.s of the marginals $X_k$ and $X_{-k}$ of
$X$, respectively, and $p_{-k|k}(x) = p_{X_{-k}|X_k = x_k}(x_{-k})$ the p.m.f. of the conditional r.v.
$X_{-k}|X_k = x_k$.

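Let $X \sim m_k(n, \pi)$. We define the function $w(x) = w_X(x)$ by

\[ w(x) := \begin{cases} \dfrac{n - \sum_{i=1}^k x_i}{n\pi_0}, & \text{when } n > 0, \\[1ex] 0, & \text{when } n = 0, \end{cases} \tag{2.3} \]

where $\pi_0$ is as above. Notice that in the case $n = 0$, for each $h$ the r.v. $h(X)$ is a constant
with probability 1 [$\mathrm{Var}\,h(X) = 0$]; thus, we define $w = 0$. It is obvious that $X_k \sim b(n, \pi_k)$ and
$X_{-k}|X_k = x_k \sim m_{k-1}(n - x_k, \varpi)$, where $\varpi \in (0, 1)^{k-1}$ with $\varpi_i = \pi_i/(1 - \pi_k)$, $i = 1, \ldots, k - 1$.

Thus, we define the functions $w_k(x) = w_{X_k}(x_k)$ and $w_{-k|k}(x) = w_{X_{-k}|X_k = x_k}(x_{-k})$ by

\[ w_k(x) := \begin{cases} \dfrac{n - x_k}{n(1 - \pi_k)}, & \text{when } n > 0, \\[1ex] 0, & \text{when } n = 0, \end{cases} \qquad w_{-k|k}(x) := \begin{cases} \dfrac{(1 - \pi_k)\big(n - \sum_{i=1}^k x_i\big)}{\pi_0 (n - x_k)}, & \text{when } x_k < n, \\[1ex] 0, & \text{when } x_k = n. \end{cases} \tag{2.4} \]

Let $X \sim nm_k(r, \theta)$. Then $X_k \sim nb(r, \vartheta_k)$, where $\vartheta_k = \theta_0/(\theta_0 + \theta_k)$, and
$X_{-k}|X_k = x_k \sim nm_{k-1}(r + x_k, \theta_{-k})$. Thus, similarly, we define the functions $w$, $w_k$ and $w_{-k|k}$ by

\[ w(x) := \frac{\theta_0\big(r + \sum_{i=1}^k x_i\big)}{r}, \qquad w_k(x) := \frac{\theta_0 (r + x_k)}{(\theta_0 + \theta_k)\, r} \qquad \text{and} \qquad w_{-k|k}(x) := \frac{(\theta_0 + \theta_k)\big(r + \sum_{i=1}^k x_i\big)}{r + x_k}. \tag{2.5} \]

For both cases (multinomial and negative multinomial distribution) one can easily see that

\[ p_{-k|k}(x + e_k) = p_{-k|k}(x)\, w_{-k|k}(x) \qquad \text{and} \qquad w_k(x)\, w_{-k|k}(x) = w(x). \tag{2.6} \]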
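The two relations in (2.6) can be checked by brute force in the multinomial case. The sketch below enumerates the support of a small multinomial distribution and tests both relations exactly; the parameter values and the helper names (`multinomial_pmf`, `marginal_last`, `cond_pmf`, `w_cond`) are assumptions made for this illustration, and the three $w$-functions are coded from the multinomial formulas $w = (n - \sum_i x_i)/(n\pi_0)$, $w_k = (n - x_k)/[n(1-\pi_k)]$ and $w_{-k|k} = (1-\pi_k)(n - \sum_i x_i)/[\pi_0(n - x_k)]$.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_pmf(x, n, pi):
    """Exact p.m.f. of m_k(n, pi) at the integer point x; pi_0 = 1 - sum(pi)."""
    x0 = n - sum(x)
    if x0 < 0 or min(x) < 0:
        return Fraction(0)
    coef = factorial(n) // factorial(x0)
    prob = (1 - sum(pi)) ** x0
    for xi, p in zip(x, pi):
        coef //= factorial(xi)
        prob *= p ** xi
    return coef * prob

# Illustrative parameters (assumption): k = 3 cells plus the remainder cell.
n, pi = 4, (Fraction(1, 5), Fraction(3, 10), Fraction(1, 4))
k, pi0 = len(pi), 1 - sum(pi)
support = [x for x in product(range(n + 1), repeat=k) if sum(x) <= n]

def marginal_last(xk):
    """p.m.f. of X_k at xk, obtained by summing the joint p.m.f. over x_{-k}."""
    return sum(multinomial_pmf(x, n, pi) for x in support if x[-1] == xk)

def cond_pmf(x):
    """p_{-k|k}(x): p.m.f. of X_{-k} given X_k = x_k, evaluated at x_{-k}."""
    return multinomial_pmf(x, n, pi) / marginal_last(x[-1])

def w(x):
    return Fraction(n - sum(x)) / (n * pi0)

def w_k(x):
    return Fraction(n - x[-1]) / (n * (1 - pi[-1]))

def w_cond(x):
    """w_{-k|k}(x), set to 0 when x_k = n."""
    if x[-1] == n:
        return Fraction(0)
    return (n - sum(x)) * (1 - pi[-1]) / ((n - x[-1]) * pi0)

# Second relation in (2.6): w_k * w_{-k|k} = w on the whole support.
factorisation_ok = all(w_k(x) * w_cond(x) == w(x) for x in support)

# First relation in (2.6): shifting x_k by one multiplies p_{-k|k} by w_{-k|k}.
shift_ok = all(
    cond_pmf(x[:-1] + (x[-1] + 1,)) == cond_pmf(x) * w_cond(x)
    for x in support
    if x[-1] < n
)

print(factorisation_ok, shift_ok)
```

The shift relation is only tested for $x_k < n$, since conditioning on $X_k = n + 1$ is not defined for the multinomial.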
Next, we prove the following useful lemma. 

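Lemma 2.1 Let $X \sim m_k(n, \pi)$ or $nm_k(r, \theta)$. Consider a real function $g$ defined on the support
of $X$ such that $\mathbb{E}|X_j g(X)|$ and $\mathbb{E}|X_j g_i(X)|$ are finite for all $i, j = 1, \ldots, k$.

(a) The following covariance identity holds:

\[ \mathrm{Cov}\Big[\sum_{i=1}^k X_i,\ g(X)\Big] = \mathbb{E}\Big(w(X) \sum_{i=1}^k c_i\, g_i(X)\Big), \tag{2.7} \]

where $w$ is given by (2.3) or (2.5), respectively, and $c_i = \sum_{j=1}^k \sigma_{ij}$ with $\sigma_{ij} = \mathrm{Cov}(X_i, X_j)$.

(b) The next identity is valid (for the multinomial only when $X_k < n$):

\[ \Delta_k \mathbb{E}[g(X)|X_k] = \mathbb{E}\Big[w_{-k|k}(X)\Big(g_k(X) + a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X)\Big)\Big|X_k\Big], \tag{2.8} \]

where $c_{i|k} = \sum_{j=1}^{k-1} \sigma_{ij|k}$ with $\sigma_{ij|k} = \mathrm{Cov}(X_i, X_j|X_k)$, and $a_k = a(X_k)$ is
$-(1 - \pi_k)/[\pi_0(n - X_k)]$ for the multinomial and $(\theta_0 + \theta_k)/(r + X_k)$ for the negative multinomial.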
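Both identities of the lemma can be verified numerically for a small multinomial distribution. The following sketch is only an illustration under assumed parameter values and an arbitrary test function `g`; covariances are computed from the p.m.f. by enumeration rather than taken from closed forms, and all helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_pmf(x, n, pi):
    """Exact p.m.f. of m_k(n, pi) at x (x is assumed to lie in the support)."""
    x0 = n - sum(x)
    coef = factorial(n) // factorial(x0)
    prob = (1 - sum(pi)) ** x0
    for xi, p in zip(x, pi):
        coef //= factorial(xi)
        prob *= p ** xi
    return coef * prob

n, pi = 4, (Fraction(1, 5), Fraction(3, 10), Fraction(1, 4))   # assumption
k, pi0 = len(pi), 1 - sum(pi)
support = [x for x in product(range(n + 1), repeat=k) if sum(x) <= n]
p = {x: multinomial_pmf(x, n, pi) for x in support}

def g(x):                                   # arbitrary test function (assumption)
    return x[0] ** 2 + x[0] * x[1] - 3 * x[2] ** 2 + x[1]

def diff(x, i):
    """Forward difference g_i(x) = g(x + e_i) - g(x)."""
    return g(x[:i] + (x[i] + 1,) + x[i + 1:]) - g(x)

def expect(f, weights):
    return sum(f(x) * px for x, px in weights.items())

def row_sums_of_cov(weights, dims):
    """c_i = sum_j Cov(X_i, X_j) over the first `dims` coordinates."""
    m = [expect(lambda x, i=i: x[i], weights) for i in range(dims)]
    return [sum(expect(lambda x, i=i, j=j: (x[i] - m[i]) * (x[j] - m[j]), weights)
                for j in range(dims)) for i in range(dims)]

def w(x):
    return Fraction(n - sum(x)) / (n * pi0)

# Identity (2.7).
c = row_sums_of_cov(p, k)
mean_total = expect(lambda x: sum(x), p)
lhs_27 = expect(lambda x: (sum(x) - mean_total) * g(x), p)
rhs_27 = expect(lambda x: w(x) * sum(c[i] * diff(x, i) for i in range(k)), p)

def conditional(xk):
    """Conditional p.m.f. of X given X_k = xk, as a dict over full points x."""
    part = {x: px for x, px in p.items() if x[-1] == xk}
    total = sum(part.values())
    return {x: px / total for x, px in part.items()}

def check_28(xk):
    """Identity (2.8) at X_k = xk < n."""
    here, there = conditional(xk), conditional(xk + 1)
    lhs = expect(g, there) - expect(g, here)
    a_k = -(1 - pi[-1]) / (pi0 * (n - xk))
    c_cond = row_sums_of_cov(here, k - 1)
    def integrand(x):
        w_cond = (n - sum(x)) * (1 - pi[-1]) / ((n - xk) * pi0)
        inner = sum(c_cond[i] * diff(x, i) for i in range(k - 1))
        return w_cond * (diff(x, k - 1) + a_k * inner)
    return lhs == expect(integrand, here)

lemma_a_ok = lhs_27 == rhs_27
lemma_b_ok = all(check_28(xk) for xk in range(n))
print(lemma_a_ok, lemma_b_ok)
```

The forward differences evaluate `g` one step outside the support at boundary points, but those terms carry weight $w(x) = 0$ (respectively $w_{-k|k}(x) = 0$), so they do not affect either side.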
Proof (a) For multinomial and negative multinomial distributions the identity (A.10) is
valid. Note that for both cases $w^i(x) = w(x)$ for each $i = 1, \ldots, k$, where the function $w(x)$
is given by (2.3) or (2.5), respectively (see Cacoullos and Papathanasiou [11, pp. 178-179]).
So,

\[ \mathrm{Cov}[q^i(X), g(X)] = \mathbb{E}\, w(X) g_i(X), \quad \text{for all } i = 1, \ldots, k. \]

Now, by (A.8) it follows that $X = \Sigma\, q(X)$. Therefore, $\sum_{i=1}^k X_i = \sum_{i=1}^k c_i\, q^i(X)$. Combining
the above relations, (2.7) follows.

(b) We write $\Delta_k \mathbb{E}[g(X)|X_k] = \mathbb{E}[g(X)|X_k + 1] - \mathbb{E}[g(X)|X_k]$. Using (2.6),
$\Delta_k \mathbb{E}[g(X)|X_k] = \mathbb{E}[w_{-k|k}(X) g_k(X)|X_k] + \mathbb{E}[w_{-k|k}(X) g(X)|X_k] - \mathbb{E}[g(X)|X_k]$.
Since $\mathbb{E}[w_{-k|k}(X)|X_k] = 1$, it follows that
$\Delta_k \mathbb{E}[g(X)|X_k] = \mathbb{E}[w_{-k|k}(X) g_k(X)|X_k] + \mathrm{Cov}[w_{-k|k}(X), g(X)|X_k]$. From
(2.3) and (2.5) we observe that $w_{-k|k}(X) = a_k \sum_{i=1}^{k-1} X_i + \beta_k$, where $\beta_k = \beta(X_k)$ is a constant
in $X_1, \ldots, X_{k-1}$. Thus,

\[ \Delta_k \mathbb{E}[g(X)|X_k] = \mathbb{E}[w_{-k|k}(X) g_k(X)|X_k] + a_k\, \mathrm{Cov}\Big[\sum_{i=1}^{k-1} X_i,\ g(X)\Big|X_k\Big]. \]

Finally, from the conditions on $g$ it follows that $\mathbb{E}[|X_j g(X)|\,|X_k] < \infty$ for all $j = 1, \ldots, k - 1$,
and $\mathbb{E}[|X_j g_i(X)|\,|X_k] < \infty$ for all $i, j = 1, \ldots, k - 1$. Thus, using (2.7) for $X_{-k}|X_k$, the lemma
is proved. □



3 The main result 

We are now in a position to state and prove the main result. 

Theorem 3.1 Let $X \sim m_k(n, \pi)$ or $nm_k(r, \theta)$. Consider a function $g$ defined on the support
of $X$; for the negative multinomial assume further that $\mathrm{Var}\,g(X)$ is finite. Then,

\[ \mathrm{Var}\,g(X) \le \mathbb{E}\big[w(X)\nabla^t g(X)\, \Sigma\, \nabla g(X)\big], \tag{3.1} \]

where $\Sigma$ is the dispersion matrix of $X$ and the function $w$ is given by (2.3) for the multinomial
and by (2.5) for the negative multinomial. Equality in (3.1) holds if and
only if $g$ is a linear function with respect to $x_1, \ldots, x_k$, i.e. of the form $g(x) = \beta_0 + \sum_{i=1}^k \beta_i x_i$.
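The statement of the theorem can be illustrated numerically in the multinomial case with $k = 2$. The sketch below evaluates both sides of (3.1) exactly for one nonlinear and one linear function; the parameters, the two test functions and the helper `variance_and_bound` are assumptions introduced for the example, and the multinomial dispersion matrix is coded from $\sigma_i^2 = n\pi_i(1-\pi_i)$, $\sigma_{ij} = -n\pi_i\pi_j$.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_pmf(x, n, pi):
    """Exact p.m.f. of m_k(n, pi) at x (x is assumed to lie in the support)."""
    x0 = n - sum(x)
    coef = factorial(n) // factorial(x0)
    prob = (1 - sum(pi)) ** x0
    for xi, p in zip(x, pi):
        coef //= factorial(xi)
        prob *= p ** xi
    return coef * prob

n, pi = 5, (Fraction(1, 4), Fraction(1, 3))          # illustrative (assumption)
k, pi0 = len(pi), 1 - sum(pi)
support = [x for x in product(range(n + 1), repeat=k) if sum(x) <= n]
p = {x: multinomial_pmf(x, n, pi) for x in support}

# Dispersion matrix of the multinomial distribution.
sigma = [[n * pi[i] * (1 - pi[i]) if i == j else -n * pi[i] * pi[j]
          for j in range(k)] for i in range(k)]

def w(x):
    return Fraction(n - sum(x)) / (n * pi0)

def variance_and_bound(g):
    """Return (Var g(X), E[w(X) grad^t g(X) Sigma grad g(X)])."""
    mean_g = sum(g(x) * px for x, px in p.items())
    var_g = sum((g(x) - mean_g) ** 2 * px for x, px in p.items())
    bound = Fraction(0)
    for x, px in p.items():
        # Discrete gradient built from forward differences g_i(x).
        d = [g(x[:i] + (x[i] + 1,) + x[i + 1:]) - g(x) for i in range(k)]
        quad = sum(d[i] * sigma[i][j] * d[j] for i in range(k) for j in range(k))
        bound += w(x) * quad * px
    return var_g, bound

var_nl, bound_nl = variance_and_bound(lambda x: x[0] ** 2 - x[0] * x[1])
var_lin, bound_lin = variance_and_bound(lambda x: 2 + 3 * x[0] - 5 * x[1])

print("nonlinear g: bound holds  ->", var_nl <= bound_nl)
print("linear g:    equality     ->", var_lin == bound_lin)
```

For the linear function the gradient is constant, so the right-hand side reduces to $\beta^t \Sigma \beta\, \mathbb{E}\,w(X)$ with $\mathbb{E}\,w(X) = 1$, which is why equality is exact in rational arithmetic.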

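Proof If $\mathbb{E}[w(X)\nabla^t g(X)\, \Sigma\, \nabla g(X)]$ is infinite, the relation (3.1) is trivial. Assume that
$\mathbb{E}[w(X)\nabla^t g(X)\, \Sigma\, \nabla g(X)]$ is finite, so that the conditions of Lemma 2.1 are valid. The
proof is by induction on $k$. For $k = 1$, (3.1) holds; see (1.1). Assuming that (3.1)
is valid for $k - 1$, for some $k > 1$, we will prove that (3.1) is also valid for $k$. It is well known
that

\[ \mathrm{Var}\,g(X) = \mathbb{E}[\mathrm{Var}(g(X)|X_k)] + \mathrm{Var}[\mathbb{E}(g(X)|X_k)]. \tag{3.2} \]

Using (1.1) for $X_k$ it follows that

\[ \mathrm{Var}[\mathbb{E}(g(X)|X_k)] \le \sigma_k^2\, \mathbb{E}\big[w_k(X)\, (\Delta_k \mathbb{E}(g(X)|X_k))^2\big], \tag{3.3} \]

where $\sigma_k^2 = \mathrm{Var}\,X_k$. Note that $w_k(X)|_{X_k = n} = 0$. Thus, from (2.8) it follows that

\[ \mathrm{Var}[\mathbb{E}(g(X)|X_k)] \le \sigma_k^2\, \mathbb{E}\Big[w_k(X)\, \Big(\mathbb{E}\Big[w_{-k|k}(X)\Big(g_k(X) + a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X)\Big)\Big|X_k\Big]\Big)^2\Big]. \]

Since $\mathbb{E}[w_{-k|k}(X)|X_k] = 1$, using the Cauchy-Schwarz inequality it follows that

\[ \Big(\mathbb{E}\Big[w_{-k|k}(X)\Big(g_k(X) + a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X)\Big)\Big|X_k\Big]\Big)^2 \le \mathbb{E}\Big[w_{-k|k}(X)\Big(g_k(X) + a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X)\Big)^2\Big|X_k\Big]. \tag{3.4} \]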

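Using (2.6),

\[ \begin{aligned} \mathrm{Var}[\mathbb{E}(g(X)|X_k)] &\le \mathbb{E}\,\mathbb{E}\Big[\sigma_k^2 w(X)\Big(g_k^2(X) + 2a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X) g_k(X) + \Big(a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X)\Big)^2\Big)\Big|X_k\Big] \\ &= \mathbb{E}\Big[\sigma_k^2 w(X)\Big(g_k^2(X) + 2a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X) g_k(X) + \Big(a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(X)\Big)^2\Big)\Big]. \end{aligned} \tag{3.5} \]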

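By the induction hypothesis (3.1) (with $k - 1$ in place of $k$) it follows that

\[ \mathrm{Var}(g(X)|X_k) \le \mathbb{E}\big[w_{-k|k}(X)\nabla^t_{-k} g(X)\, \Sigma_{-k|k}\, \nabla_{-k} g(X)\big|X_k\big], \tag{3.6} \]

where $\Sigma_{-k|k}$ is the dispersion matrix of $X_{-k}|X_k$ and $\nabla_{-k} g = (g_1, \ldots, g_{k-1})^t$. Thus,

\[ \begin{aligned} \mathbb{E}[\mathrm{Var}(g(X)|X_k)] &\le \mathbb{E}\,\mathbb{E}\big[w_{-k|k}(X)\nabla^t_{-k} g(X)\, \Sigma_{-k|k}\, \nabla_{-k} g(X)\big|X_k\big] \\ &= \mathbb{E}\big[w_{-k|k}(X)\nabla^t_{-k} g(X)\, \Sigma_{-k|k}\, \nabla_{-k} g(X)\big] \\ &= \mathbb{E}\Big[w_{-k|k}(X)\Big(\sum_{i=1}^{k-1} \sigma^2_{i|k}\, g_i^2(X) + 2 \sum_{1 \le i < j \le k-1} \sigma_{ij|k}\, g_i(X) g_j(X)\Big)\Big]. \end{aligned} \tag{3.7} \]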
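From (3.2), via (3.5) and (3.7), we have that

\[ \begin{aligned} \mathrm{Var}\,g(X) \le \mathbb{E}\Big[ & w(X)\sigma_k^2\, g_k^2(X) + \sum_{i=1}^{k-1} \big[w(X)\sigma_k^2 a_k^2 c_{i|k}^2 + w_{-k|k}(X)\sigma^2_{i|k}\big]\, g_i^2(X) \\ & + 2 \sum_{i=1}^{k-1} w(X)\sigma_k^2 a_k c_{i|k}\, g_i(X) g_k(X) \\ & + 2 \sum_{1 \le i < j \le k-1} \big[w(X)\sigma_k^2 a_k^2 c_{i|k} c_{j|k} + w_{-k|k}(X)\sigma_{ij|k}\big]\, g_i(X) g_j(X) \Big]. \end{aligned} \]

After some algebra (see Appendix B) it follows that

\[ \mathrm{Var}\,g(X) \le \mathbb{E}\Big[w(X)\Big(\sum_{i=1}^{k} \sigma_i^2\, g_i^2(X) + 2 \sum_{1 \le i < j \le k} \sigma_{ij}\, g_i(X) g_j(X)\Big)\Big], \]

and (3.1) is proved.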

Consider the function $g(x) = \beta_0 + \sum_{i=1}^k \beta_i x_i$. One can easily see that (3.1) holds as
equality. Conversely, assume that (3.1) holds as equality. Then (3.3), (3.4) and (3.6) hold
as equalities. From the equality in (3.6), under the induction hypothesis, it follows that
$g(x) = \varrho_0(x_k) + \sum_{i=1}^{k-1} \varrho_i(x_k)\, x_i$. From the equality in (3.4) we have that the quantity
$g_k(x) + a_k \sum_{i=1}^{k-1} c_{i|k}\, g_i(x)$ is a constant in $x_1, \ldots, x_{k-1}$. Combining the above relations it follows that
the quantity

\[ \Delta_k \varrho_0(x_k) + \sum_{i=1}^{k-1} [\Delta_k \varrho_i(x_k)]\, x_i + a_k \sum_{i=1}^{k-1} c_{i|k}\, \varrho_i(x_k) = \sum_{i=1}^{k-1} [\Delta_k \varrho_i(x_k)]\, x_i + h(x_k) \]

is a constant in $x_1, \ldots, x_{k-1}$. Therefore, $\Delta_k \varrho_i(x_k) = 0$ for all $i = 1, \ldots, k - 1$, that is, $\varrho_i(x_k) = \beta_i$,
$i = 1, \ldots, k - 1$, are constants. Thus, $g(x) = \varrho_0(x_k) + \sum_{i=1}^{k-1} \beta_i x_i$. Finally, from the equality
in (3.3) it follows that the quantity $\mathbb{E}(g(X)|X_k = x_k)$ is a linear function of $x_k$. Moreover,
$\mathbb{E}(g(X)|X_k = x_k) = \mathbb{E}\big(\varrho_0(X_k) + \sum_{i=1}^{k-1} \beta_i X_i \big| X_k = x_k\big) = \varrho_0(x_k) + \sum_{i=1}^{k-1} \beta_i\, \mathbb{E}(X_i|X_k = x_k)$.
For both cases the quantity $\sum_{i=1}^{k-1} \beta_i\, \mathbb{E}(X_i|X_k = x_k)$ is a linear function of $x_k$. Hence, $\varrho_0(x_k)$
is a linear function of $x_k$, i.e. $\varrho_0(x_k) = \beta_0 + \beta_k x_k$, and the proof is complete. □
A The discrete multivariate covariance identity and some useful properties

Let $X$ be a $k$-dimensional random vector with probability mass function $p$ supported by a "convex" set $C^k \subseteq \mathbb{N}^k$
such that $0 = (0, \ldots, 0)^t \in C^k$ (in the sense that if $x = (x_1, \ldots, x_k)^t \in C^k$ then $\{0, \ldots, x_1\} \times \cdots \times \{0, \ldots, x_k\} \subseteq C^k$).
Assume that the mean $\mu$ and the dispersion matrix $\Sigma$ of $X$ are well defined ($\Sigma > 0$) and consider the
vector of linear functions

\[ q(x) = (q^1(x), \ldots, q^k(x))^t := \Sigma^{-1} x. \tag{A.8} \]

Then the $w$-function of $X$ is well defined for every $x \in C^k$ by $w(x) = (w^1(x), \ldots, w^k(x))^t$ with

\[ w^i(x)\, p(x) = \sum_{j=0}^{x_i} \big[\mu^i - q^i(x_{(i,j)})\big]\, p(x_{(i,j)}), \tag{A.9} \]

where $\mu^i = \mathbb{E}\, q^i(X)$ and $x_{(i,j)} = (x_1, \ldots, x_{i-1}, j, x_{i+1}, \ldots, x_k)^t$ for $i = 1, \ldots, k$ (see Cacoullos and
Papathanasiou [11], Papadatos and Papathanasiou [21]).

Cacoullos and Papathanasiou [11] established the identity

\[ \mathrm{Cov}[q^i(X), g(X)] = \mathbb{E}\, w^i(X) g_i(X), \tag{A.10} \]

provided that $\mathbb{E}|w^i(X) g_i(X)| < \infty$ and $\mathbb{E}|(q^i(X) - \mu^i) g(X)| < \infty$, $i = 1, 2, \ldots, k$.
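To make (A.8) and (A.9) concrete, the sketch below evaluates the components $w^1$ and $w^2$ from (A.9) for a bivariate multinomial distribution and compares them with the scalar function $w(x) = (n - x_1 - x_2)/(n\pi_0)$. The parameters and helper names (`inverse_2x2`, `q`, `w_component`) are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def multinomial_pmf(x, n, pi):
    """Exact p.m.f. of m_k(n, pi) at x (x is assumed to lie in the support)."""
    x0 = n - sum(x)
    coef = factorial(n) // factorial(x0)
    prob = (1 - sum(pi)) ** x0
    for xi, p in zip(x, pi):
        coef //= factorial(xi)
        prob *= p ** xi
    return coef * prob

n, pi = 4, (Fraction(1, 4), Fraction(2, 5))           # illustrative (assumption)
pi0 = 1 - sum(pi)
support = [x for x in product(range(n + 1), repeat=2) if sum(x) <= n]
p = {x: multinomial_pmf(x, n, pi) for x in support}

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Dispersion matrix of the bivariate multinomial and its inverse.
sigma = [[n * pi[0] * (1 - pi[0]), -n * pi[0] * pi[1]],
         [-n * pi[0] * pi[1], n * pi[1] * (1 - pi[1])]]
sigma_inv = inverse_2x2(sigma)

def q(x, i):
    """i-th coordinate of q(x) = Sigma^{-1} x, as in (A.8)."""
    return sum(sigma_inv[i][j] * x[j] for j in range(2))

# mu^i = E q^i(X).
mu_q = [sum(q(x, i) * px for x, px in p.items()) for i in range(2)]

def w_component(x, i):
    """w^i(x) from (A.9): sum over j = 0..x_i with coordinate i replaced by j."""
    total = Fraction(0)
    for j in range(x[i] + 1):
        y = x[:i] + (j,) + x[i + 1:]
        total += (mu_q[i] - q(y, i)) * p[y]
    return total / p[x]

def w(x):
    return Fraction(n - sum(x)) / (n * pi0)

components_agree = all(w_component(x, i) == w(x) for x in support for i in range(2))
print(components_agree)
```

The check covers only this one bivariate multinomial example; it does not replace the general argument in Cacoullos and Papathanasiou [11].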

B Necessary algebra for the proof of Theorem 3.1 

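Multinomial case:

First, we calculate

\[ c_{i|k} = \sigma^2_{i|k} + \sum_{j \ne i} \sigma_{ij|k} = \frac{(n - x_k)\pi_i(1 - \pi_i - \pi_k)}{(1 - \pi_k)^2} + \sum_{j \ne i} \frac{-(n - x_k)\pi_i \pi_j}{(1 - \pi_k)^2} = \frac{(n - x_k)\pi_0 \pi_i}{(1 - \pi_k)^2}, \]

where the sums run over $j \in \{1, \ldots, k - 1\} \setminus \{i\}$. Hence:

\[ \begin{aligned} w(x)\sigma_k^2 a_k^2 c_{i|k}^2 + w_{-k|k}(x)\sigma^2_{i|k} &= \frac{n - \sum_{s=1}^k x_s}{n\pi_0}\, n\pi_k(1 - \pi_k)\, \frac{(1 - \pi_k)^2}{\pi_0^2 (n - x_k)^2}\, \frac{(n - x_k)^2 \pi_0^2 \pi_i^2}{(1 - \pi_k)^4} + \frac{(1 - \pi_k)\big(n - \sum_{s=1}^k x_s\big)}{(n - x_k)\pi_0}\, \frac{(n - x_k)\pi_i(1 - \pi_i - \pi_k)}{(1 - \pi_k)^2} \\ &= \frac{n - \sum_{s=1}^k x_s}{n\pi_0} \Big(\frac{n\pi_i^2 \pi_k}{1 - \pi_k} + \frac{n\pi_i(1 - \pi_i - \pi_k)}{1 - \pi_k}\Big) \\ &= w(x)\, n\pi_i\, \frac{\pi_i \pi_k + 1 - \pi_i - \pi_k}{1 - \pi_k} = w(x)\, n\pi_i(1 - \pi_i) = w(x)\sigma_i^2, \end{aligned} \]

\[ w(x)\sigma_k^2 a_k c_{i|k} = w(x)\, n\pi_k(1 - \pi_k)\, \frac{-(1 - \pi_k)}{\pi_0 (n - x_k)}\, \frac{(n - x_k)\pi_0 \pi_i}{(1 - \pi_k)^2} = w(x)(-n\pi_i \pi_k) = w(x)\sigma_{ik}, \]

\[ \begin{aligned} w(x)\sigma_k^2 a_k^2 c_{i|k} c_{j|k} + w_{-k|k}(x)\sigma_{ij|k} &= \frac{n - \sum_{s=1}^k x_s}{n\pi_0}\, n\pi_k(1 - \pi_k)\, \frac{(1 - \pi_k)^2}{\pi_0^2 (n - x_k)^2}\, \frac{(n - x_k)^2 \pi_0^2 \pi_i \pi_j}{(1 - \pi_k)^4} + \frac{(1 - \pi_k)\big(n - \sum_{s=1}^k x_s\big)}{(n - x_k)\pi_0}\, \frac{-(n - x_k)\pi_i \pi_j}{(1 - \pi_k)^2} \\ &= \frac{n - \sum_{s=1}^k x_s}{n\pi_0} \Big(\frac{n\pi_i \pi_j \pi_k}{1 - \pi_k} - \frac{n\pi_i \pi_j}{1 - \pi_k}\Big) = w(x)(-n\pi_i \pi_j) = w(x)\sigma_{ij}. \end{aligned} \]

Negative multinomial case:

We calculate

\[ c_{i|k} = \sigma^2_{i|k} + \sum_{j \ne i} \sigma_{ij|k} = \frac{(r + x_k)\theta_i(\theta_0 + \theta_i + \theta_k)}{(\theta_0 + \theta_k)^2} + \sum_{j \ne i} \frac{(r + x_k)\theta_i \theta_j}{(\theta_0 + \theta_k)^2}, \quad \text{so} \quad c_{i|k} = \frac{(r + x_k)\theta_i}{(\theta_0 + \theta_k)^2}. \]

Hence:

\[ \begin{aligned} w(x)\sigma_k^2 a_k^2 c_{i|k}^2 + w_{-k|k}(x)\sigma^2_{i|k} &= \frac{\theta_0\big(r + \sum_{s=1}^k x_s\big)}{r}\, \frac{r\theta_k(\theta_0 + \theta_k)}{\theta_0^2}\, \frac{(\theta_0 + \theta_k)^2}{(r + x_k)^2}\, \frac{(r + x_k)^2 \theta_i^2}{(\theta_0 + \theta_k)^4} + \frac{(\theta_0 + \theta_k)\big(r + \sum_{s=1}^k x_s\big)}{r + x_k}\, \frac{(r + x_k)\theta_i(\theta_0 + \theta_i + \theta_k)}{(\theta_0 + \theta_k)^2} \\ &= w(x)\, \frac{r\theta_i}{\theta_0^2}\, \frac{\theta_i(\theta_0 + \theta_k) + \theta_0(\theta_0 + \theta_k)}{\theta_0 + \theta_k} = w(x)\, \frac{r\theta_i(\theta_0 + \theta_i)}{\theta_0^2} = w(x)\sigma_i^2, \end{aligned} \]

\[ w(x)\sigma_k^2 a_k c_{i|k} = w(x)\, \frac{r\theta_k(\theta_0 + \theta_k)}{\theta_0^2}\, \frac{\theta_0 + \theta_k}{r + x_k}\, \frac{(r + x_k)\theta_i}{(\theta_0 + \theta_k)^2} = w(x)\, \frac{r\theta_i \theta_k}{\theta_0^2} = w(x)\sigma_{ik}, \]

\[ \begin{aligned} w(x)\sigma_k^2 a_k^2 c_{i|k} c_{j|k} + w_{-k|k}(x)\sigma_{ij|k} &= \frac{\theta_0\big(r + \sum_{s=1}^k x_s\big)}{r}\, \frac{r\theta_k(\theta_0 + \theta_k)}{\theta_0^2}\, \frac{(\theta_0 + \theta_k)^2}{(r + x_k)^2}\, \frac{(r + x_k)^2 \theta_i \theta_j}{(\theta_0 + \theta_k)^4} + \frac{(\theta_0 + \theta_k)\big(r + \sum_{s=1}^k x_s\big)}{r + x_k}\, \frac{(r + x_k)\theta_i \theta_j}{(\theta_0 + \theta_k)^2} \\ &= w(x)\, \frac{r\theta_i \theta_j}{\theta_0^2}\, \frac{\theta_k + \theta_0}{\theta_0 + \theta_k} = w(x)\, \frac{r\theta_i \theta_j}{\theta_0^2} = w(x)\sigma_{ij}. \end{aligned} \]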
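The three identities of this appendix are pure algebra, so they can be spot-checked at arbitrary parameter values with exact rational arithmetic. The sketch below does this for both distributions; the function names and the sample points are assumptions chosen for illustration, and the closed forms for the variances, covariances, $a_k$ and $c_{i|k}$ are those used in the computations of this appendix.

```python
from fractions import Fraction

def check_multinomial(n, pi, x):
    """Check the three Appendix B identities for m_k(n, pi) at a point x with x_k < n."""
    k, pi0, pk, xk = len(pi), 1 - sum(pi), pi[-1], x[-1]
    w = Fraction(n - sum(x)) / (n * pi0)
    w_cond = (n - sum(x)) * (1 - pk) / ((n - xk) * pi0)
    var_k = n * pk * (1 - pk)
    a_k = -(1 - pk) / (pi0 * (n - xk))
    c = [(n - xk) * pi0 * pi[i] / (1 - pk) ** 2 for i in range(k - 1)]
    checks = []
    for i in range(k - 1):
        var_i_cond = (n - xk) * pi[i] * (1 - pi[i] - pk) / (1 - pk) ** 2
        checks.append(w * var_k * a_k**2 * c[i]**2 + w_cond * var_i_cond
                      == w * n * pi[i] * (1 - pi[i]))
        checks.append(w * var_k * a_k * c[i] == w * (-n * pi[i] * pk))
        for j in range(i + 1, k - 1):
            cov_ij_cond = -(n - xk) * pi[i] * pi[j] / (1 - pk) ** 2
            checks.append(w * var_k * a_k**2 * c[i] * c[j] + w_cond * cov_ij_cond
                          == w * (-n * pi[i] * pi[j]))
    return all(checks)

def check_negative_multinomial(r, theta, x):
    """Check the three Appendix B identities for nm_k(r, theta) at a point x."""
    k, t0, tk, xk = len(theta), 1 - sum(theta), theta[-1], x[-1]
    w = t0 * (r + sum(x)) / r
    w_cond = (t0 + tk) * (r + sum(x)) / (r + xk)
    var_k = r * tk * (t0 + tk) / t0**2
    a_k = (t0 + tk) / (r + xk)
    c = [(r + xk) * theta[i] / (t0 + tk) ** 2 for i in range(k - 1)]
    checks = []
    for i in range(k - 1):
        var_i_cond = (r + xk) * theta[i] * (t0 + tk + theta[i]) / (t0 + tk) ** 2
        checks.append(w * var_k * a_k**2 * c[i]**2 + w_cond * var_i_cond
                      == w * r * theta[i] * (t0 + theta[i]) / t0**2)
        checks.append(w * var_k * a_k * c[i] == w * r * theta[i] * tk / t0**2)
        for j in range(i + 1, k - 1):
            cov_ij_cond = (r + xk) * theta[i] * theta[j] / (t0 + tk) ** 2
            checks.append(w * var_k * a_k**2 * c[i] * c[j] + w_cond * cov_ij_cond
                          == w * r * theta[i] * theta[j] / t0**2)
    return all(checks)

# Sample points (assumptions): any admissible values can be substituted.
multinomial_ok = check_multinomial(
    7, (Fraction(1, 5), Fraction(3, 10), Fraction(1, 4)), (1, 2, 3))
negative_ok = check_negative_multinomial(
    3, (Fraction(1, 6), Fraction(1, 4), Fraction(1, 3)), (2, 0, 5))
print(multinomial_ok, negative_ok)
```

A check at a handful of points is not a proof, but because each identity is a rational-function identity in the parameters, a failure at a generic point would expose an algebraic slip immediately.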